Search for: All records

Creators/Authors contains: "Chen, Xinyi"


  1. ABSTRACT

    We present a cosmic density field reconstruction method that augments the traditional reconstruction algorithms with a convolutional neural network (CNN). Following previous work, the key component of our method is to use the reconstructed density field as the input to the neural network. We extend this previous work by exploring how the performance of these reconstruction ideas depends on the input reconstruction algorithm, the reconstruction parameters, and the shot noise of the density field, as well as the robustness of the method. We build an eight-layer CNN and train the network with reconstructed density fields computed from the Quijote suite of simulations. The reconstructed density fields are generated by both the standard algorithm and a new iterative algorithm. In real space at z = 0, we find that the reconstructed field is 90 per cent correlated with the true initial density out to $k\sim 0.5 \, \mathrm{ h}\, \rm {Mpc}^{-1}$, a significant improvement over $k\sim 0.2 \, \mathrm{ h}\, \rm {Mpc}^{-1}$ achieved by the input reconstruction algorithms. We find similar improvements in redshift space, including an improved removal of redshift space distortions at small scales. We also find that the method is robust across changes in cosmology. Additionally, the CNN removes much of the variance from the choice of different reconstruction algorithms and reconstruction parameters. However, the effectiveness decreases with increasing shot noise, suggesting that such an approach is best suited to high density samples. This work highlights the additional information in the density field beyond linear scales as well as the power of complementing traditional analysis approaches with machine learning techniques.

     
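    The headline metric quoted above is how well the network output correlates with the true initial density field as a function of scale. Below is a minimal sketch of that diagnostic in plain NumPy: it bins the cross-power of two 3D density grids into shells of |k| and divides by the auto-power of each field. The grid layout, box size, and binning here are placeholder assumptions, and the eight-layer CNN itself is not reproduced.

    import numpy as np

    def cross_correlation_coefficient(delta_rec, delta_true, box_size, n_bins=30):
        """r(k) = P_cross(k) / sqrt(P_rec(k) * P_true(k)) for two density grids.

        delta_rec, delta_true : cubic (n, n, n) overdensity grids
        box_size              : box side length in Mpc/h (placeholder value)
        """
        n = delta_rec.shape[0]
        f_rec = np.fft.rfftn(delta_rec)
        f_true = np.fft.rfftn(delta_true)
        # Wavenumber magnitude of every Fourier mode, in h/Mpc.
        k_fund = 2.0 * np.pi / box_size
        kx = np.fft.fftfreq(n, d=1.0 / n) * k_fund
        kz = np.fft.rfftfreq(n, d=1.0 / n) * k_fund
        kmag = np.sqrt(kx[:, None, None] ** 2 + kx[None, :, None] ** 2 + kz[None, None, :] ** 2)
        # Accumulate cross- and auto-power in spherical shells of |k|.
        edges = np.linspace(k_fund, kmag.max(), n_bins + 1)
        shell = np.digitize(kmag.ravel(), edges)
        p_cross = np.real(f_rec * np.conj(f_true)).ravel()
        p_rec = (np.abs(f_rec) ** 2).ravel()
        p_true = (np.abs(f_true) ** 2).ravel()
        k_mid = 0.5 * (edges[:-1] + edges[1:])
        r_of_k = np.full(n_bins, np.nan)
        for i in range(n_bins):
            sel = shell == i + 1
            if np.any(sel):
                r_of_k[i] = p_cross[sel].sum() / np.sqrt(p_rec[sel].sum() * p_true[sel].sum())
        return k_mid, r_of_k

    A curve from such a function staying above 0.9 out to k of roughly 0.5 h/Mpc would correspond to the improvement reported in the abstract, versus roughly 0.2 h/Mpc for the input reconstruction alone.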
  2. The targeted insertion and stable expression of a large genetic payload in primary human cells demands methods that are robust, efficient and easy to implement. Large payload insertion via retroviruses is typically semi-random and hindered by transgene silencing. Leveraging homology-directed repair to place payloads under the control of endogenous essential genes can overcome silencing but often results in low knock-in efficiencies and cytotoxicity. Here we report a method for the knock-in and stable expression of a large payload and for the simultaneous knock-in of two genes at two endogenous loci. The method, which we named CLIP (for 'CRISPR for long-fragment integration via pseudovirus'), leverages an integrase-deficient lentivirus encoding a payload flanked by homology arms and 'cut sites' to insert the payload upstream and in-frame of an endogenous essential gene, followed by the delivery of a CRISPR-associated ribonucleoprotein complex via electroporation. We show that CLIP enables the efficient insertion and stable expression of large payloads and of two difficult-to-express viral antigens in primary T cells at low cytotoxicity. CLIP offers a scalable and efficient method for manufacturing engineered primary cells. 
    Free, publicly-accessible full text available May 1, 2024
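    As a toy illustration of the donor layout described above (a payload flanked by homology arms so that it lands in frame upstream of an endogenous essential gene), the sketch below assembles a donor sequence around a hypothetical cut site. The genome string, cut-site coordinate, arm length, and frame check are illustrative placeholders, not the construct, loci, or arm lengths used in the paper.

    def build_donor(genome: str, cut_site: int, payload: str, arm_len: int = 500) -> str:
        """Return left homology arm + payload + right homology arm around a cut site."""
        if len(payload) % 3 != 0:
            # Keep the inserted coding sequence in frame with the downstream gene.
            raise ValueError("payload length must be a multiple of 3")
        left_arm = genome[max(0, cut_site - arm_len):cut_site]
        right_arm = genome[cut_site:cut_site + arm_len]
        return left_arm + payload + right_arm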
  3. Background: Nucleomorphs are remnants of secondary endosymbiotic events between two eukaryote cells wherein the endosymbiont has retained its eukaryotic nucleus. Nucleomorphs have evolved at least twice independently, in chlorarachniophytes and cryptophytes, yet they have converged on a remarkably similar genomic architecture, characterized by the most extreme compression and miniaturization among all known eukaryotic genomes. Previous computational studies have suggested that nucleomorph chromatin likely exhibits a number of divergent features. Results: In this work, we provide the first maps of open chromatin, active transcription, and three-dimensional organization for the nucleomorph genome of the chlorarachniophyte Bigelowiella natans. We find that the B. natans nucleomorph genome exists in a highly accessible state, akin to that of ribosomal DNA in some other eukaryotes, and that it is highly transcribed over its entire length, with few signs of polymerase pausing at transcription start sites (TSSs). At the same time, most nucleomorph TSSs show very strong nucleosome positioning. Chromosome conformation (Hi-C) maps reveal that nucleomorph chromosomes interact with one another at their telomeric regions and show the relative contact frequencies between the multiple genomic compartments of distinct origin that B. natans cells contain. Conclusions: We provide the first study of a nucleomorph genome using modern functional genomic tools, and derive numerous novel insights into the physical and functional organization of these unique genomes.
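    One of the Hi-C observations above is that nucleomorph chromosomes contact one another preferentially at their telomeric regions. The sketch below shows one plausible way to summarize such a signal from a binned contact matrix; the number of end bins treated as "telomeric", the bin size, and the matrix normalization are assumptions, not the paper's actual pipeline.

    import numpy as np

    def telomeric_vs_internal_contacts(contacts, chrom_of_bin, n_end_bins=2):
        """Mean inter-chromosomal contact frequency for telomere-proximal vs. other bins.

        contacts     : (n_bins, n_bins) balanced Hi-C contact matrix
        chrom_of_bin : length-n_bins array of chromosome labels, one per bin
        """
        contacts = np.asarray(contacts, dtype=float)
        chrom_of_bin = np.asarray(chrom_of_bin)
        is_end = np.zeros(len(chrom_of_bin), dtype=bool)
        for chrom in np.unique(chrom_of_bin):
            idx = np.flatnonzero(chrom_of_bin == chrom)
            is_end[idx[:n_end_bins]] = True   # bins at one end of the chromosome
            is_end[idx[-n_end_bins:]] = True  # bins at the other end
        # Restrict to contacts between bins on different chromosomes.
        inter = chrom_of_bin[:, None] != chrom_of_bin[None, :]
        end_pairs = inter & is_end[:, None] & is_end[None, :]
        other_pairs = inter & ~end_pairs
        return contacts[end_pairs].mean(), contacts[other_pairs].mean()

    A markedly higher first value than second would be consistent with the telomere-to-telomere contact pattern described above.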
  6.
    Resource disaggregation is a new architecture for data centers in which resources like memory and storage are decoupled from the CPU, managed independently, and connected through a high-speed network. Recent work has shown that although disaggregated data centers (DDCs) provide operational benefits, applications running on DDCs experience degraded performance due to extra network latency between the CPU and their working sets in main memory. DBMSs are an interesting case study for DDCs for two main reasons: (1) DBMSs normally process data-intensive workloads and require data movement between different resource components; and (2) disaggregation drastically changes the assumption that DBMSs can rely on their own internal resource management. We take the first step to thoroughly evaluate the query execution performance of production DBMSs in disaggregated data centers. We evaluate two popular open-source DBMSs (MonetDB and PostgreSQL) and test their performance with the TPC-H benchmark in a recently released operating system for resource disaggregation. We evaluate these DBMSs with various configurations and compare their performance with that of single-machine Linux with the same hardware resources. Our results confirm that significant performance degradation does occur, but, perhaps surprisingly, we also find settings in which the degradation is minor or where DDCs actually improve performance. 
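    The evaluation described above runs the TPC-H benchmark against MonetDB and PostgreSQL on a disaggregation-oriented operating system and on single-machine Linux. As a minimal sketch of that kind of measurement (not the authors' harness), the snippet below times one TPC-H query against two PostgreSQL endpoints through psycopg2; the connection strings, the choice of query, and the repeat count are assumptions for illustration.

    import time
    import psycopg2  # PostgreSQL client library

    # TPC-H Q6 with standard parameter substitutions; the query set and parameters
    # used in the paper's experiments are not given in the abstract.
    TPCH_Q6 = """
    SELECT sum(l_extendedprice * l_discount) AS revenue
    FROM lineitem
    WHERE l_shipdate >= DATE '1994-01-01'
      AND l_shipdate < DATE '1995-01-01'
      AND l_discount BETWEEN 0.05 AND 0.07
      AND l_quantity < 24;
    """

    def time_query(dsn: str, sql: str, repeats: int = 3) -> float:
        """Run a query several times and return the best wall-clock latency in seconds."""
        best = float("inf")
        with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
            for _ in range(repeats):
                start = time.perf_counter()
                cur.execute(sql)
                cur.fetchall()
                best = min(best, time.perf_counter() - start)
        return best

    # Hypothetical DSNs: one conventional single-machine deployment, one node running
    # a disaggregated setup.
    for label, dsn in [("single-machine", "dbname=tpch host=local-node"),
                       ("disaggregated", "dbname=tpch host=ddc-node")]:
        print(label, f"{time_query(dsn, TPCH_Q6):.3f} s")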
  8. Many graph problems can be solved using ordered parallel graph algorithms that achieve significant speedup over their unordered counterparts by reducing redundant work. This paper introduces a new priority-based extension to GraphIt, a domain-specific language for writing graph applications, to simplify writing high-performance parallel ordered graph algorithms. The extension enables vertices to be processed in a dynamic order while hiding low-level implementation details from the user. We extend the compiler with new program analyses, transformations, and code generation to produce fast implementations of ordered parallel graph algorithms. We also introduce bucket fusion, a new performance optimization that fuses together different rounds of ordered algorithms to reduce synchronization overhead, resulting in 1.2x--3x speedup over the fastest existing ordered algorithm implementations on road networks with large diameters. With the extension, GraphIt achieves up to 3x speedup on six ordered graph algorithms over state-of-the-art frameworks and hand-optimized implementations (Julienne, Galois, and GAPBS) that support ordered algorithms. 
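    The extension above targets ordered algorithms in which vertices are processed by a dynamic priority, typically maintained in buckets. The sketch below is a plain-Python, sequential illustration of that pattern (a simplified delta-stepping shortest-path loop); it is not GraphIt code and does not show bucket fusion, which fuses consecutive bucket-processing rounds to reduce synchronization overhead.

    import math
    from collections import defaultdict

    def delta_stepping(graph, source, delta=2):
        """Bucket-ordered single-source shortest paths (simplified, sequential).

        graph : dict mapping vertex -> iterable of (neighbour, weight) pairs,
                with non-negative weights
        delta : bucket width; vertices are relaxed bucket by bucket in
                increasing-distance order
        """
        dist = defaultdict(lambda: math.inf)
        dist[source] = 0
        buckets = defaultdict(set)
        buckets[0].add(source)
        while buckets:
            i = min(buckets)               # lowest non-empty bucket first
            for u in buckets.pop(i):
                if dist[u] // delta != i:  # stale entry: u already moved to a better bucket
                    continue
                for v, w in graph.get(u, []):
                    nd = dist[u] + w
                    if nd < dist[v]:       # relax the edge and re-bucket v by its new priority
                        dist[v] = nd
                        buckets[int(nd // delta)].add(v)
        return {v: d for v, d in dist.items() if d < math.inf}

    # Example: delta_stepping({"a": [("b", 1), ("c", 4)], "b": [("c", 1)]}, "a")
    # returns {"a": 0, "b": 1, "c": 2}.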